Abstract

By noting that a Rasch or 2PL item belongs to the exponential family of
random variables and that the probability density functions (pdf) of the
correct response ($X = 1$) and the incorrect response ($X = 0$) are
symmetric with respect to the vertical line at the item location, it is
shown that the conjugate prior for ability is proportional to
$[I(\theta)]^{\alpha}$, where $I(\theta)$ is the item information and
$\alpha$ is a positive constant. When the above prior is applied to a 3PL
item, the requirement that item selection rules be bound to the
traditional formula for correction for random guessing implies that the
constant $\alpha$ must be 1. Thus, maximum information (MI) selection
rules for 3PL items are the only rules that are consistent with a Bayesian
analysis based on the family of conjugate priors and with the use of the
correction-for-guessing formula.



Notes 

A paper presented at the annual meeting of the American Educational
Research Association, New Orleans, April 2000. The author’s address is
College of Education, Wardlaw 138, University of South Carolina, Columbia,
SC 29208. Email address: huynh-huynh@sc.edu.
Introduction



Criterion-referenced (CR) measures have been used extensively in the
United States in the last several years. Such measures, according to Glaser
(1963), would provide explicit information as to what an individual can do
or cannot do on a continuum of achievement. Across the years, procedures
have been developed to provide meaningful interpretations of test scores.
For tests used in the National Assessment of Educational Progress (NAEP)
(Beaton & Allen, 1992), for example, CR interpretation is referred to as
scale anchoring. Criterion-referencing is also the basic concept that
underlies the CTB Bookmark standard setting procedure (Lewis & Mitzel,
1995; Lewis, Mitzel, & Green, 1996) and other procedures of a similar
nature, such as the one used in the Maryland School Performance Assessment
Program (MSPAP) (Westat, 1993, 1994).

In general, to describe a point on a NAEP achievement continuum (an
anchor point) through scale anchoring, a set of items is first selected
based on some specified statistical criteria. A content expert committee is
convened to examine the items and then arrive at a general description of
the skills and performances that are expected from examinees at the anchor
point. Similarly, the CTB Bookmark standard setting process begins with
placing all items on the achievement continuum and creating an ordered test
form. Judges are then asked to place a bookmark at a place in the ordered
test form that represents their cutoff score for the proficiency level
under consideration (such as the basic, proficient, and advanced levels
used in many state assessment programs). Once a cutoff score is finalized,
a group of judges is asked to look at the items that surround the cutoff
and determine the nature of the skills associated with this level of
achievement.

Historically, the statistical criteria for selecting items for scale
anchoring rely on the probability associated with the correct response at
various anchor points. Consider a binary item and let $p^+_{i-1}$ and
$p^+_i$ be the proportions of correct responses at two successive anchor
points $\theta_{i-1}$ and $\theta_i$. Beaton and Allen (1992) indicated
that, in some NAEP situations, an item was selected to describe anchor
point $\theta_i$ if $p^+_i > .80$ and $p^+_{i-1} < .50$. In other NAEP
cases, such as the 1990 mathematics scale anchoring, an item was selected
if $p^+_i \geq .65$, $p^+_{i-1} < .50$, and $p^+_i - p^+_{i-1} > .30$. In
more recent years (Allen, Kline, & Zelenak, 1996, p. 265), NAEP scale
anchoring has been based only on the rule that specifies that
$p^+_i > .65$ for binary items with no guessing and $p^+_i > .74$ for
multiple-choice items with four options. The proportion of .74 can be
derived from the proportion of .65 by using the traditional rule regarding
correction for guessing. In fact, if .65 represents the proportion of
examinees who know the item (and therefore answer it correctly), then the
proportion of examinees who do not know the item is .35. For a
multiple-choice item with four options, traditional correction for guessing
stipulates that one-fourth of the latter examinees
($.35 \div 4 \approx .09$) would guess the item correctly. So for
multiple-choice items with four options, the threshold value of .65 is now
raised to $.65 + .09 = .74$.

Earlier work on the CTB Bookmark process relies on the statistical rule
$p^+_i = .50$ for placing an item without guessing on the achievement
continuum (Lewis & Mitzel, 1995). The latest version of this standard
setting process (Lewis et al., 1996) is based on the statistical rule
$p^+_i = 2/3$ or .67. The formula for correction for random guessing is
used to adjust these cutoff probabilities for multiple-choice items. For an
item with four alternatives, for example, the cutoff probability of .50 is
adjusted to $.50 + (.50 \div 4) \approx .63$. As for the cutoff probability
of 2/3 or .67 for an item without guessing, the adjusted probability is
reset at $.67 + (.33 \div 4) \approx .75$.
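The adjustments quoted above can be reproduced with a short sketch. The function name `guessing_adjusted` is illustrative only and does not come from the cited reports; the sketch assumes that every examinee who does not know the item guesses at random among the options.

```python
def guessing_adjusted(p_know: float, n_options: int) -> float:
    """Raise a no-guessing threshold for a multiple-choice item.

    Assumption: examinees who do not know the answer (proportion 1 - p_know)
    guess at random, so 1 / n_options of them answer correctly.
    """
    return p_know + (1.0 - p_know) / n_options


# Thresholds discussed in the text, for items with four options.
for p in (0.50, 0.65, 2.0 / 3.0):
    print(f"{p:.2f} -> {guessing_adjusted(p, 4):.4f}")
```

To two decimals the three results agree with the values .63, .74, and .75 given in the text.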

It may be noted that the NAEP rules for scale anchoring are based largely
on practical experience and feasibility, and not on any theoretical
consideration. In the context of standard setting, Huynh (1994) points out
the need to know what a student can be expected to do or to know at a given
point on the achievement continuum. Using the Bock (1972) partition of the
Fisher item information of a Rasch binary item to its correct response,
Huynh arrived at the selection criterion $p^+_i > 2/3$ or .67 for a binary
item without guessing. Subsequently, Huynh (1998) extended the work on
selection rules to three-parameter logistic (3PL) and polytomous items. In
the 1998 paper, Huynh started with a Bayesian framework with a prior that
is proportional to the item information. It turns out that this Bayesian
approach is equivalent to the use of the Bock (1972) partition of the item
information to each of its categories. For a 3PL item, the maximum
information (MI) rule was found to be $p^+_i > (2 + c)/3$, where $c$ is the
guessing parameter. Huynh called this the “principal rule.”

This paper extends the work by Huynh (1998) on item selection rules
for scale anchoring and Bookmark construction of ordered test forms. Its
major purpose is to further explore the Bayesian method for the same topic.
Attention will be focussed on the family of conjugate priors in the
exponential family of probability density functions (pdf). It will be shown
in the last part of the paper that, if the family of conjugate priors for
Rasch or 2PL items is also used for 3PL items, then the maximum information
(MI) principal rule presented in Huynh (1998) is the only rule that is
consistent with the use of the traditional formula for correction for
random guessing.

General Bayesian Framework 

The reader may note that this paper deals only with the characteristics 
of a given item (along with its score categories) in a latent trait setting. 
To frame the problem within a mathematical statistics context, the item
is identified with a random variable and its probability density
function (pdf). A random sample from this random variable is assumed to
exist. This existence implies the notion of independent repeated testings on
the same item or identical items and is assumed in conceptualizing and using 
the Fisher information. In addition, the search for an appropriate prior for 
the item is equivalent to the process of placing the item at the ability that is 
most suitable for the item. 

Given the above general remarks, the Bayesian approach used for binary 
items in this paper can also be brought into an empirical Bayes context. 
Given an item (that is defined by its parameters in a latent trait setting), 
two general questions will be asked. 

Question 1: To which population of examinees is the item most suitable? 

Question 2: Among examinees of this population, what is the typical
ability of those who answer the item correctly?

Within an empirical Bayes context, answering the first question amounts 
to searching for a prior that is suitable for the item. As for the second 
question, a typical ability is often found among the class of Bayes ability 
estimates associated with the correct response. 

It may be noted that the answer to Question 1 varies with each item. This
is due to the fact that each item is most suitable only for a certain point on
the achievement continuum (latent trait). (This is typically the point where
the item information is maximized.) Therefore, it is expected that
individual items would be assigned different priors in a Bayesian analysis
like the one explored in this paper.

Now let $\tau_1$ be the answer to Question 2 and $\theta$ be any point of
the achievement continuum. We will use the following definition in Huynh
(1998, p. 47, Definition 2).

Definition. An examinee with ability $\theta$ is said “to be expected to
answer the item correctly” if $\theta > \tau_1$.
To select an item for describing the anchor point $\theta_i$, it will be
assumed that examinees at this point have the skills and knowledge beyond
what is expected of the correct response to the item. The above definition
will now lead to an item being selected as an anchoring item if
$\theta_i > \tau_1$. Equivalently, let $p^+_i$ and $p_1$ be the proportions
of correct responses at $\theta_i$ and $\tau_1$. The item selection rule can
be stated as follows.

Rule: Select an item if $p^+_i > p_1$.

A Review of MI Item Selection 



A major purpose of this paper is to find an appropriate value for $p_1$.
Consider a binary item that follows a three-parameter logistic (3PL) model
with traditional parameters $a$, $b$, and $c$. Let $P(\theta)$ be the
probability associated with the correct response $X = 1$ and
$Q(\theta) = 1 - P(\theta)$ be the probability associated with the
incorrect response $X = 0$. The probability $P(\theta)$ is given by the
following formula:



$$P(\theta) = c + (1 - c)\,\frac{\exp[a(\theta - b)]}{1 + \exp[a(\theta - b)]}. \tag{1}$$



For ease of notation, let $K = a^2/(1 - c)^2$, $P = P(\theta)$, and
$Q = Q(\theta)$. Then the item information is known to be equal to

$$I(\theta) = K(1 - P)(P - c)^2/P. \tag{2}$$

Note that the constant $K$ does not depend on either $\theta$ or $P$.

Following a suggestion by Bock (1972, Equation 24), Huynh (1998)
partitions the total item information $I(\theta)$ to each of the two
responses $X = 0$ and $X = 1$ according to the probabilities $P(\theta)$
and $Q(\theta)$. More specifically, the information assigned to the correct
response $X = 1$ is taken as

$$I_1(\theta) = I(\theta)P(\theta) = K(1 - P)(P - c)^2. \tag{3}$$

This function is maximized at the value $\theta_1 = b + (\log 2)/a$, or
equivalently, at the value $P = p_1 = (c + 2)/3$. From the value $p_1$,
Huynh (1998, p. 48) stated the following rule for selecting items for scale
anchoring.

Maximum Information Rule: Select an item if $p^+_i > (c + 2)/3$.

Thus a binary Rasch or 2PL item without guessing is selected if
$p^+_i > 2/3$ (or .67). As for a multiple-choice 3PL item with four options,
the constant $c$ may be taken as 1/4 and hence the rule becomes
$p^+_i > .75$.
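As a numerical check of the maximizer, the following sketch evaluates the information assigned to the correct response on a grid of abilities and compares the grid maximizer with the closed forms $b + (\log 2)/a$ and $(c + 2)/3$. The item parameters are hypothetical, and the positive constant $K$ is dropped because it does not affect the location of the maximum.

```python
import math


def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model, equation (1)."""
    z = math.exp(a * (theta - b))
    return c + (1.0 - c) * z / (1.0 + z)


def info_correct(theta, a, b, c):
    """Information assigned to X = 1, up to the constant K: (1 - P)(P - c)^2."""
    p = p_3pl(theta, a, b, c)
    return (1.0 - p) * (p - c) ** 2


a, b, c = 1.2, 0.3, 0.25  # hypothetical item parameters
grid = [b - 6.0 + i * 0.001 for i in range(12001)]  # abilities from b-6 to b+6

theta_grid = max(grid, key=lambda t: info_correct(t, a, b, c))
theta_closed = b + math.log(2.0) / a
p_closed = (c + 2.0) / 3.0

print(theta_grid, theta_closed)
print(p_3pl(theta_closed, a, b, c), p_closed)
```

With $c = .25$ the closed-form threshold is .75, the value stated in the Maximum Information Rule for four-option items.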
It is noted in the introduction of this paper that the proportion
$p_1 = (c + 2)/3$ for a 3PL item can be derived from the proportion of 2/3
for a 2PL item by using the traditional rule regarding correction for random
guessing. In fact, if 2/3 represents the proportion of examinees who know the
item (and therefore answer it correctly), then the proportion of examinees
who do not know the item is 1/3. For a 3PL item with $k$ options, the
constant $c$ may be taken approximately as $1/k$. Let us assume that all
examinees who do not know the item will randomly guess at the item. Under
this assumption (of random guessing), the proportion of examinees who guess
the item correctly is equal to $1/(3k)$ or $c/3$. Hence, the threshold of
2/3 for a 2PL item is now raised to $2/3 + c/3$ or $(2 + c)/3$ for a 3PL
item.

Bayesian Framework for Rasch or 2PL Items 

Conjugate prior and Bayes Score Locations 

Consider a binary 2PL item with parameters $a$ and $b$ (and with $c = 0$).
Without loss of generality, we will absorb the constant $a$ into the latent
trait $\theta$ and set $a = 1$ for the rest of this section. The 2PL item
now becomes a Rasch binary item with difficulty parameter $b$. The random
variable $X$ that represents the two responses $x = 0, 1$ follows the
probability density function (pdf):

$$f_X(x \mid \theta) = \frac{\exp[x(\theta - b)]}{1 + \exp(\theta - b)}. \tag{4}$$

We will now search for the family of conjugate priors for $\theta$. To do
this, let $\mathbf{x} = (x_1, \ldots, x_n)$ be a random sample from the
random variable $X$ and let its likelihood be written as

$$f_X(x_1, \ldots, x_n \mid \theta) = \frac{\exp\left[(\theta - b)\sum_{i=1}^{n} x_i\right]}{\{1 + \exp(\theta - b)\}^{n}}. \tag{5}$$

In the context of psychometric theory, this random sample may be
conceptualized as the responses of a given examinee of ability $\theta$
from $n$ independent repeated administrations of the (same) item or the
independent administration of $n$ identical items. The random sample can
also be thought of as the responses to the (same) item from a random sample
of $n$ examinees with identical ability $\theta$. See Hambleton &
Swaminathan (1990; p. 27) for a discussion on these interpretations.
From the derivations presented in Bernardo and Smith (1994, pp. 266-267),
it follows that the family of conjugate prior pdfs for the Rasch binary
item will take the general form

$$f_\Theta(\theta) = \frac{\exp[\alpha(\theta - b)]}{K(\alpha, \beta)\,\{1 + \exp(\theta - b)\}^{\beta}} \tag{6}$$

where $\alpha$ and $\beta$ are any positive constants with $\alpha < \beta$,
and $K(\alpha, \beta)$ is a suitable constant that depends on $\alpha$ and
$\beta$.

It may also be noted that the response $X$ of a Rasch binary item is also
a Bernoulli random variable with success probability of

$$p = \exp(\theta - b) / \{1 + \exp(\theta - b)\}.$$

Hence the family of conjugate priors takes the form of the beta density in
the argument $p$. This beta pdf is equal to

$$f_P(p) = p^{u-1}(1 - p)^{v-1} / B(u, v) \tag{7}$$

where $u$ and $v$ are positive constants and $B(u, v)$ is a suitable
positive constant that depends on $u$ and $v$. By taking note of equation
(7) and the following partial derivative

$$\frac{\partial p}{\partial \theta} = \exp(\theta - b) / \{1 + \exp(\theta - b)\}^{2} = p(1 - p), \tag{8}$$

it follows that the pdf of $\Theta$ is given as

$$f_\Theta(\theta) = p^{u}(1 - p)^{v} / B(u, v) \tag{9}$$

or

$$f_\Theta(\theta) = K(u, v) \exp[u(\theta - b)] / \{1 + \exp(\theta - b)\}^{u+v} \tag{10}$$

where $K(u, v)$ is a suitable constant. By taking $\alpha = u$ and
$\beta = u + v$, it may be seen that the pdf of $\Theta$ in equation (6) is
identical to the pdf of $\Theta$ in equation (10).

As in Huynh (1998), we will take the modal Bayes estimate $\tau_x$ for
$\Theta$ to be the Bayes score location of the response $x$. This score
location may be computed by setting the derivative (with respect to
$\theta$) of $\log[P_x(\theta) f_\Theta(\theta)]$ to zero, where
$P_x(\theta) = f_X(x \mid \theta)$. The process yields the score location

$$\tau_x = b + \log[(x + \alpha)/(\beta + 1 - x - \alpha)]. \tag{11}$$

Thus the Bayes location of the correct response $x = 1$ is

$$\tau_1 = b + \log[(\alpha + 1)/(\beta - \alpha)]. \tag{12}$$

Taking into account the identity $\log[\exp(z)] = z$, it may be verified
that at this Bayes location, the probability of getting the correct
response is

$$p_1 = (\alpha + 1)/(\beta + 1). \tag{13}$$
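The score location in equation (11) can be checked numerically. The sketch below maximizes the unnormalized log posterior on a grid and compares the result with the closed form; the values of $b$, $\alpha$, and $\beta$ are hypothetical, and the function names are illustrative only.

```python
import math


def log_posterior(theta, x, b, alpha, beta):
    """Unnormalized log posterior for a Rasch response x under prior (6).

    log f_X(x | theta) + log prior
        = (x + alpha)(theta - b) - (1 + beta) log(1 + exp(theta - b)).
    """
    d = theta - b
    return (x + alpha) * d - (1.0 + beta) * math.log1p(math.exp(d))


def score_location(x, b, alpha, beta):
    """Closed-form posterior mode, equation (11)."""
    return b + math.log((x + alpha) / (beta + 1.0 - x - alpha))


b, alpha, beta = 0.0, 0.5, 1.0  # hypothetical values with alpha < beta
grid = [-8.0 + i * 0.001 for i in range(16001)]

tau_grid = max(grid, key=lambda t: log_posterior(t, 1, b, alpha, beta))
tau_1 = score_location(1, b, alpha, beta)
p_1 = 1.0 / (1.0 + math.exp(-(tau_1 - b)))  # Rasch probability at tau_1

print(tau_grid, tau_1)
print(p_1, (alpha + 1.0) / (beta + 1.0))
```

For these values the closed form gives $\tau_1 = b + \log 3$ and $p_1 = .75$, in agreement with equations (12) and (13).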

Symmetric prior and item information function 

It may be noted that the probabilities $f_X(X = 0 \mid \theta)$ and
$f_X(X = 1 \mid \theta)$ of the incorrect ($x = 0$) and correct ($x = 1$)
responses to the Rasch item are symmetric with respect to the (vertical)
line $\theta = b$. Therefore, a priori, it may make sense to treat these
responses “equally” by requiring that the prior of $\Theta$ be symmetric
with respect to the line $\theta = b$. This condition is equivalent to the
requirement that the Bayes score locations of the incorrect and correct
responses are symmetric with respect to the item location $b$. This
symmetry requirement is fulfilled when $\beta = 2\alpha$. The symmetric
conjugate prior pdf for $\Theta$ now takes the special form

$$f_\Theta(\theta) = \left\{\exp(\theta - b)/[1 + \exp(\theta - b)]^{2}\right\}^{\alpha} / K(\alpha, 2\alpha). \tag{14}$$

With the item information being

$$I(\theta) = \exp(\theta - b)/[1 + \exp(\theta - b)]^{2}, \tag{15}$$

the symmetric conjugate prior for $\Theta$ may be written as

$$f_\Theta(\theta) = [I(\theta)]^{\alpha} / K(\alpha, 2\alpha). \tag{16}$$

General Empirical Bayes Rule 

With this symmetric prior, the Bayes location of the correct response is

$$\tau_1 = b + \log[(\alpha + 1)/\alpha]. \tag{17}$$

Taking into account the identity $\log[\exp(z)] = z$, it may be verified
that at this Bayes location, the probability of getting the correct
response is

$$p_1 = (\alpha + 1)/(2\alpha + 1). \tag{18}$$

Thus the general empirical Bayes rule can be stated as follows.

General Empirical Bayes Rule: Select an item if
$p^+_i > (\alpha + 1)/(2\alpha + 1)$, where $\alpha$ is a positive constant.

With $\alpha$ being positive, the quantity $(\alpha + 1)/(2\alpha + 1)$ is
larger than .50. Thus any item selected under the general empirical Bayes
rule has a probability of correct response larger than .50 at the anchor
point. The remainder of this section addresses two special cases regarding
the parameter $\alpha$.

When $\alpha = 1/2$, this prior becomes a member of the family of
noninformative priors that were proposed and studied by Jeffreys (1939,
1948, 1961). Methods for constructing Jeffreys’ priors (and other
noninformative or reference priors) may be found in Berger and Bernardo
(1992), Lehmann (1983, p. 241), and Schervish (1995, pp. 121-123). With
Jeffreys’ prior, the Bayes score location for the correct response is
$\tau_1 = b + \log 3$. At this location, the $p_1$ probability is 3/4
(or .75).

The other special case is for $\alpha = 1$. This corresponds to the prior
that was considered by Huynh (1998) as a way to introduce the Bock
partition of item information (Bock, 1972, Equation 24) to each of the two
responses $x = 0$ and $x = 1$. The Bayes score location for the correct
response is $\tau_1 = b + \log 2$ and the $p_1$ probability is 2/3
(or .67).

Bayesian Framework for 3PL Items 

Consider now a 3PL item with parameters $a$, $b$, and $c$ and with item
information given as

$$I(\theta) = K(1 - P)(P - c)^2/P. \tag{19}$$

As in the case of Rasch or 2PL items, we will consider the family of priors
that are proportional to $[I(\theta)]^{\alpha}$, where $\alpha$ is a
positive constant. The Bayes score location for the correct response is the
value of $\theta$ at which the function $[I(\theta)]^{\alpha} P$ is
maximized. Since $P$ is an increasing function of $\theta$, maximization
may be accomplished by finding the value of $P$ at which the derivative
(with respect to $P$) of $\log\{[I(\theta)]^{\alpha} P\}$ is equal to zero.
Thus $P$ is a solution of the equation

$$\alpha\left[-\frac{1}{1 - P} + \frac{2}{P - c} - \frac{1}{P}\right] + \frac{1}{P} = 0. \tag{20}$$

Algebraic manipulations yield the following quadratic equation for $P$.

$$(2\alpha + 1)P^2 - (\alpha + c + 1)P + c(1 - \alpha) = 0. \tag{21}$$
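The algebraic step from the first-order condition to the quadratic can be sketched as follows. This is a reconstruction from the information function $I(\theta) = K(1 - P)(P - c)^2/P$, not a transcription of the author's own working.

```latex
\begin{align*}
\frac{d}{dP}\log\{[I(\theta)]^{\alpha}P\}
  &= \alpha\left[-\frac{1}{1-P}+\frac{2}{P-c}-\frac{1}{P}\right]+\frac{1}{P}=0, \\
\intertext{and multiplying through by $P(1-P)(P-c)$ gives}
  \alpha\,(-2P^{2}+P+c) + (1-P)(P-c) &= 0, \\
  (2\alpha+1)P^{2}-(\alpha+c+1)P+c(1-\alpha) &= 0 .
\end{align*}
```

The last line follows by expanding $(1-P)(P-c) = -P^2 + (1+c)P - c$, collecting powers of $P$, and changing sign.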

The function on the left side of this equation is negative at $P = c$ and
positive at $P = 1$. Therefore, the above quadratic equation has only one
solution between $c$ and 1. Without undue difficulty, it may also be
verified that this solution maximizes the function
$[I(\theta)]^{\alpha} P$. Thus the Bayes score location for the correct
response is the value $\theta$ at which the probability of the correct
response is equal to

$$p_1^* = \frac{\alpha + c + 1 + [(\alpha + c + 1)^2 - 4c(2\alpha + 1)(1 - \alpha)]^{1/2}}{2(2\alpha + 1)}. \tag{22}$$

As an illustration, consider a 3PL item with $c = .25$. With Jeffreys’
prior ($\alpha = .5$), the quadratic equation becomes

$$2P^2 - 1.75P + .125 = 0. \tag{23}$$

This equation yields the solution $P = p_1^* = 0.80$. (The other solution
$P = .08$ is smaller than the value of $c$ and, therefore, is not
acceptable.) As noted in the last section, for a Rasch or 2PL item, the
threshold probability for item selection is $p_1 = .75$. Thus, under
Jeffreys’ prior, a Rasch or 2PL item is selected for anchor point
$\theta_i$ if $p^+_i > .75$. As for a 3PL item with four options (and with
$c = .25$), the selection rule is $p^+_i > .80$. It may be verified that the
threshold value for $p^+_i$ (namely .80) for a 3PL item (with guessing)
cannot be deduced from the value .75 for a 2PL item (without guessing) from
the formula for correction for random guessing, which would give
$.75 + .25(1 - .75) \approx .81$.

The next section will investigate the condition under which the formula
for correction for guessing can be used to relate these two threshold values
for $p^+_i$.
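The illustration can be reproduced with a few lines of code. The sketch below computes the larger root of the quadratic for a prior proportional to $[I(\theta)]^{\alpha}$ and compares it with the guessing-corrected 2PL threshold; the function names are illustrative only.

```python
import math


def p1_star(alpha, c):
    """Larger root of (2a + 1)P^2 - (a + c + 1)P + c(1 - a) = 0, with a = alpha."""
    A = 2.0 * alpha + 1.0
    B = alpha + c + 1.0
    disc = B * B - 4.0 * A * c * (1.0 - alpha)
    return (B + math.sqrt(disc)) / (2.0 * A)


def guessing_corrected(p_no_guess, c):
    """Traditional correction for random guessing: p + c(1 - p)."""
    return p_no_guess + c * (1.0 - p_no_guess)


alpha, c = 0.5, 0.25                      # Jeffreys' prior, four-option item
p1 = (alpha + 1.0) / (2.0 * alpha + 1.0)  # 2PL threshold under the same prior
bayes = p1_star(alpha, c)                 # 3PL threshold from the quadratic
corrected = guessing_corrected(p1, c)     # 2PL threshold after correction

print(round(p1, 4), round(bayes, 4), round(corrected, 4))
```

Under Jeffreys' prior the two 3PL thresholds differ (about .80 versus about .81), whereas calling `p1_star(1.0, 0.25)` returns .75, the same value as the guessing-corrected 2/3.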




Conditions Under Which Formula 
for Correction for Random Guessing Can Be Used 

Now for a 2PL item, the $p_1$ probability was found (Equation 18) in the
previous section to be

$$p_1 = (\alpha + 1)/(2\alpha + 1). \tag{24}$$

Under correction for random guessing, the value $p_1^*$ (Equation 22) is
related to the value $p_1$ via the formula

$$p_1^* = p_1 + c(1 - p_1). \tag{25}$$

In other words, we have

$$p_1^* = [\alpha(1 + c) + 1]/(2\alpha + 1). \tag{26}$$

Replacing this value of $p_1^*$ in equations (20) or (21) and after some
straightforward algebraic manipulations, the following equation will be
obtained.

$$c(2\alpha + 1)^2(\alpha - 1) = 0. \tag{27}$$

This equation is satisfied for all values of $c$ if and only if
$\alpha = 1$. This value for $\alpha$ corresponds to the prior considered in
Huynh (1998) as a precursor to the analysis of score locations of 3PL items
based on the Bock partition of the item information.
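The condition on $\alpha$ can also be checked numerically. The sketch below substitutes the guessing-corrected threshold $[\alpha(1 + c) + 1]/(2\alpha + 1)$ into the left side of the quadratic $(2\alpha + 1)P^2 - (\alpha + c + 1)P + c(1 - \alpha)$ and reports the residual for a few hypothetical values of $\alpha$ and $c$. It is an illustration on a small grid, not a proof.

```python
def quadratic_residual(alpha, c):
    """Left side of the quadratic for P, evaluated at the corrected threshold."""
    p = (alpha * (1.0 + c) + 1.0) / (2.0 * alpha + 1.0)
    return (2.0 * alpha + 1.0) * p * p - (alpha + c + 1.0) * p + c * (1.0 - alpha)


guess_values = [0.10, 0.20, 0.25, 0.50]  # hypothetical guessing parameters
for alpha in (0.5, 1.0, 2.0):
    residuals = [round(quadratic_residual(alpha, c), 6) for c in guess_values]
    print(alpha, residuals)
```

On this grid the residual is zero (up to rounding) only in the row for $\alpha = 1$; it is positive for $\alpha = .5$ and negative for $\alpha = 2$.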



Summary

This paper extends the work by Huynh (1994, 1998) on the topic of
selecting items for scale anchoring and criterion-referenced
interpretation. It focuses on a Bayesian analysis based on the family of
conjugate priors for Rasch and 2PL items. More specifically, it is shown
that if this family of conjugate priors is also used for 3PL items, then
the maximum information (MI) (principal) rule presented in Huynh (1998) is
the only rule that is consistent with the use of the traditional formula
for correction for random guessing.